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ABSTRACT 


The application of finite element or finite difference techniques to the 
solution of transient heat transfer problems in structures often results in a 
stiff system of ordinary differential equations. Such systems are usually 
handled most efficiently by implicit integration techniques which require the 
solution of large and sparse systems of algebraic equations. Most of the com- 
putation time required for the solution is spent in assembling and solving 
these algebraic equations. The present paper is mainly concerned with effi- 
cient assembly and solution of these systems using the incomplete Cholesky con- 
jugate gradient algorithm. Several examples are used to demonstrate the ad- 
vantage of the algorithm over other techniques. 


INTRODUCTION 


The analysis and design of high speed reentry vehicles such as the space 
shuttle require the prediction and optimization of the thermal-structural be- 
havior. This means that the analyst needs to solve the heat transfer equation 
in a structure with complex boundary conditions, irregular geometries and 
variable thermal properties- The finite element method is one of the more ef- 
fective approaches available for numerical solution of the transient heat trans- 
fer in complex structures. It is therefore expected that finite element sys- 
tems such as SPAR (Ref. 1) will play a growing role in the analysis and design 
of such vehicles. 

The application of the finite element method to transient heat transfer 
problems often results in a system of stiff ordinary differential equations 
(ode's). Stiff ode's are characterized by solutions with widely varying time 
constants. The typical case is when the solution to the homogeneous problem 
has very small time constants compared to those of the forcing function. 

A great deal of effort was devoted in recent years to the development of 
integration techniques that are suitable for the solution of stiff systems of 
ode's. In general, these are variable step size (and, sometimes, variable 
order) implicit techniques such as the Gear algorithms (Refs. 2,3). The ap- 
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plication of such techniques to structural heat transfer problems has been re- 
cently shown to be very efficient (Ref. 4) compared to explicit algorithms. 

The use of implicit integration techniques requires the repeated solution 
of large systems of algebraic equations. Because of radiation effects and tem- 
perature dependent material properties, these equations are nonlinear. These 
nonlinear equations are solved typically by the Newton Raphson method or its 
modified variant which replaces them by systems of linear equations * It is the 
assembly and solution of these large systems of equations which consumes the 
bulk of the computation time in the solution of a transient heat transfer prob- 
lem. This topic is also the focus of this paper. 

The solution of systems of linear algebraic equations can be handled by 
direct methods such as Gaussian elimination or by iterative methods such as 
successive over-relaxation (SOR) . Problems in solid mechanics and structures 
are usually discretized by a finite element method and the associated systems 
of linear equations solved by elimination techniques. On the other hand, in 
fluid mechanics problems, finite difference methods are more common and the as- 
sociated linear equations are solved by iterative techniques. Two reasons for 
the preference of structural analysts for elimination techniques are worth noting. 
The first is the ill conditioning which is typical of the systems of linear 
equations generated by a structural finite element model. This ill condition- 
ing results in very slow convergence rates of iterative solution methods. The 
second reason is the typically good band structure of the system matrices which 
results from the use of one and two dimensional finite element models. This 
property allows efficient solution by elimination using band or skyline solvers. 

In applications to transient heat transfer in structures, the finite 
element codes such as SPAR (Ref. 1) tend to use elimination techniques while 
finite difference (or lumped parameter) codes such as MITAS (Ref. 5) lean to- 
ward the use of iterative techniques. However, neither approach is entirely 
satisfactory, the iterative methods because of poor convergence and elimination 
techniques because of poor performance for wide band systems associated with 
radiation interconnectivities and three dimensional elements. 

A promising new technique which is a cross between elimination techniques 
and iterative techniques is the incomplete Cholesky conjugate gradient (ICCG) 
method developed by Meijerink and Van der Vorst (Ref. 6), and extended by 
Kershaw (Ref. 7) to asymmetric matrices. The method has been successfully ap- 
plied to finite difference modeled transport problems in plasma physics (Refs. 
8-10), to finite difference and finite element modeled boundary value problems 
(Ref. 11-13), and to finite element modeled groundwater flow problems (Ref. 

14). 


The present paper is concerned with the implementation of the ICCG method 
to transient heat transfer problems in structures modeled by finite elements. 
Because of the repeated need to assemble and solve a similar system of 
equations, it is possible to reduce the computational effort by preliminary 
calculations. The ICCG method is compared to a conventional band-matrix elimin- 
ation technique as well as to iterative techniques. A two dimensional space 
shuttle frame model and an insulated cylinder are used to demonstrate the ef- 
ficiency of the method. 
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ANALYSIS 


Numerical Integration Technique 

A discretized transient heat transfer problem is governed by the following 
system of ordinary differential equations (ODE) 


C(T,t)T = Q(T,t) - K(T,t)T T(0) given (1) 


where T is the vector of nodal temperatures, C is the capacitance matrix, K is 
the total conduction matrix (including radiation and convection effects), Q is 
the thermal load vector and a dot indicates derivatives with respect to time, 
t. The dependence of the matrices C and K on time and temperature is due to 
time and temperature dependent material properties and to radiation. However, 
the cost of recalculating these matrices whenever T or t is changed is exor- 
bitant, especially when three dimensional finite elements are involved. To 
alleviate this problem, the integration time is divided into time intervals and 
the material properties assumed to be constant in each time interval. As a 
result, in each time interval the matrix C is constant and the only variable 
part of the matrix K is due to radiation effects. 

The ODE system (1) is most efficiently solved by a variable order, vari- 
able time step algorithm such as employed in the GEAR package (Ref. 3). How- 
ever, in the present work, a simple fixed-step mid-difference (Crank Nicholson) 
algorithm is used. It was shown in Ref. 15 that the performance of the algo- 
rithm is quite satisfactory for the problems solved here. 

Using a numerical integration algorithm, we evaluate the temperature T at 
a sequence of time points t-^, t^ , .... Denoting as T the approximate solution 
for T(t^), the mid-difference algorithm replaces Eq. (l) by 


T -T _ 

^(T ) = 2C " + K(T )T + K(T )T 

n h n n n— 1 n— 1 

n 


- Q(T ,t ) - Q(T ,t ) = 0 
n n n— 1 n 


( 2 ) 


We assume that has already been calculated so that Eq. (2) is a system of 

nonlinear algebraic equations for T^^. This system of equations is solved using 
the Newton Raphson method 


II 
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( 3 ) 


(nri-1) (m) (m) 

j[t . - T ] = -iJ;(T ) 

n n-" n 

(m) 

where T is the m-th Iterate and J is the Jacobian 


T = 

3T 

n 


(4) 


(m) 

If J is not recalculated as a function of T but is kept constant, we have 
the modified Newton method. In the present work J was calculated at the be- 
ginning of each time interval (when material properties are updated) and was 
not updated inside a time interval unless the number of iterations in Eq. (3) 
exceeded three. As was noted before, the only nonlinearity in Eq. (2) is due 
to radiation. In the problems considered herein, only radiation to space was 
considered so that the Jacobian was symmetric and positive definite. 

Solution of Linear Equations 

Eqs. (3) constitutes a system of linear equations of the form 


Ax = b 


(5) 


where in our case, A is symmetric, positive definite and sparse. Eq. (5) may 
be solved by elimination techniques or by iterative methods. Herein, several 
methods of solution were compared. The first is an elimination technique, the 
Gauss-Doolittle factorization, whereby A is factored as 


A = 


T 

LDL 


( 6 ) 


where L is a lower triangular matrix with diagonal elements equal to one and D 
is a diagonal matrix. Once A has been factored the solution process proceeds 
easily. 

Most iterative methods proceed by splitting the matrix A into two parts 


A = M-N 


(7) 


and rewriting Eq. (5) as 


(I - M~^N)x = M~^b 


( 8 ) 
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From Eq. (8), the following simple fixed point iterative process can be defined 


nrfl -1 m 

X = M (b + Nx ) (9) 


It is desirable to choose M so that it is close to A, so that we get 
fast convergence (note that if M=A, N=0, a single iteration is enough). On the 
other hand, we want to choose M so that or its equivalent can be formed 
cheaply. One well known choice is to take M equal to the diagonal of A (the 
Jacobi method) and another is to take M equal to the lower triangular part of 
A (the Gauss Seidel method). Asymptotically, the error in the solution is re- 
duced by a factor equal to the largest eigenvalue of M”^N. For most finite 
element generated matrices, this eigenvalue is very close to one so that con- 
vergence is very slow. 

A method which is a cross between elimination techniques and iterative 
techniques is based on obtaining M from an incomplete elimination process. For 
a positive definite matrix A, the incomplete Cholesky decomposition M is de- 
fined as 


T 

M =: LL 


( 10 ) 


where L is a lower triangular matrix with the same sparsity structure as A 
(that is, no fill-up permitted). The matrix M is a good approximation to the 
matrix A so that most of the eigenvalues of >t1n are very small. Also, for 
a ver^^' sparse matrix A, the cost in computation time and storage for obtaining 
L is much smaller than that required for a complete decomposition. However, 
even though most of the eigenvalues of MT^N are small, it is possible for a few 
to be close to one so that the convergence of the fixed point iteration, Eq. 
(9), is slow. 

Another method which is sometimes used for the solution of linear 
equations is the conjugate gradient method. 

The conjugate gradient method can be used to solve the system (5) by ap- 
plying it to minimize the following error measure 

e = (x-x^)^A(x-x^) (11) 


where x is the exact solution and x is the current approximation. Theoreti- 
cally, the conjugate gradient method should reduce e to zero in no more than n 
iterations so that it is a deterministic method like Gaussian elimination. 
Because of round-off error, however, it does not terminate in, exactly n 
iterations and may be regarded as an itera^tive method. Its convergence, while 
dependent on the ratio of the maximum to the minimum eigenvalues of A, is more 
favorable than those of iterative methods like Gauss-Seidel . This is because 
it tends to eliminate the error components corresponding to extreme eigenvalues 
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in the first few iterations and attain a high convergence rate later. It has a 
decided advantage over the standard iterative methods for matrices with very 
few extreme eigenvalues and a large number of eigenvalues which are bunched 
together. 

Meijerink and Van der Vorst (Ref. 6) put together a very clever combina- 
tion of all the above techniques which can be very efficient for sparse poorly 
banded matrices. The idea is to apply the conjugate gradient method to Eq. (8) 
rather than Eq. (5) where the matrix M is obtained from an incomplete Cholesky 
decomposition of A, Eq. (10). 

Because the matrix M is a good approximation to A, the matrix is 

close to the unit matrix and most of its eigenvalues are close to one. This 
provides a good setting for a very fast convergence of the conjugate gradient 
method. 

To take advantage of the incomplete Cholesky conjugate gradient (ICCG) 
algorithm, sparse matrix storage techniques should be used for the matrix A. 

The method selected here and its implementation are discussed next. 

Matrix Storage , Retrieval and Assembly Technique 

The storage and retrieval technique used herein is due to Gustavson (Ref. 
16) and Tewarson (Ref. 17). One array EJ is used to store by row all the non- 
zero elements of the lower triangular part of the Jacobian. Another array IC 
of the same size contains the column numbers of the entries of EJ. Finally, a 
third array lA stores the position in EJ of the last nonzero entry in each row 
so that lA (i)- lA (i-1) is the total number of nonzero elements in the i-th 
row. A method for generating IC and lA from element data was developed in Ref. 
15. The use of the IC and lA arrays is compatible with an efficient implemen- 
tation of the ICCG algorithm and avoids any operations on zero elements of A or 
L. It is not convenient, however, for assembling the Jacobian from the indivi- 
dual element matrices. 

To expedite the generation and assembly of the Jacobian an additional 
storage system is used. The element conductivity matrices are calculated and 
stored for a unit value of the conduction coefficient. For each element matrix, 
another array IPLACE is generated which stores the destination of each entry of 
the conductivity matrices in the matrix EJ. During assembly the actual conduc- 
tion coefficient is computed based on the average temperature of the element 
and this value is used to multiply the unit matrices. The use of the IPLACE 
array together with the unit matrices reduces the assembly time for the Jaco- 
bian considerably. 

Computer Implementation 

A computer program that implements the analysis methods denoted SMITT 
(sparse Matrix Iterative Techniques for Thermal Analysis) was written for the 
IIT Prime 400 minicomputer (Ref. 15). A parallel program using conventional 
band matrix storage and Gauss-Doolittle solution was also implemented. The 
Prime 400 is a virtual memory machine and theoretically each common block or 
subroutine can have up to 640,000 32 bit words. In practice it was found that 
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beyond 64,000 words the program did not work, so that this limit controlled the 
maximum problem size. These programs were exercised on two example programs 
described in the next section. 


RESULTS AND DISCUSSION 


Shuttle Orbiter Frame 

The first test problem which was used for demonstrating the efficiency of 
the ICCG algorithm is a shuttle orbiter frame. Fig. 1, tested under transient 
heat loads by Gallegos (Ref. 18). The finite element model contains 190 grid 
points and 199 elements, including two dimensional conduction elements and one 
dimensional radiation elements. 

The properties of the aluminum structure and the insulation are functions 
of temperature (Ref. 18). The band width of the Jacobian is 39. 

The problem has been solved for 500 seconds of response time using the ICCG 
and Gauss Doolittle (GD) algorithms. Material properties were updated every 50 
seconds and the integration time step for the Crank-Nicholson method was 25 
seconds. The CPU times required for the solution are tabulated in Table 1. It 
is seen that the ICCG algorithm is about twenty percent faster than Gauss Doo- 
little. This is remarkable because the problem is only mildly sparse with more 
than twenty percent of the elements in the band being nonzero. 

Insulated Cylinder 

For the second test problem, a configuration was sought which was larger 
(in terms of number of unknown temperatures) than the shuttle frame and ex- 
hibited some of the characteristics of an insulated air frame structure. These 
considerations led to the insulated aluminum cylindrical shell depicted in Fig. 

2. The cylinder is 18.3m (720 in.) in length and 2.3 m (90 in.) in diameter. 

The aluminum is 2,5 cm (1 in.) thick and the insulation is 5.0 cm (2.0 in) thick. 
The outer surface of the insulation is heated over a region which consists of 
1/3 the length of half the circumference. The finite element model consists of 
a symmetric half of the cylinder and is composed of solid brick elements (K81 
elements in SPAR). Additionally, the outer surface of the insulation has quad- 
rilateral radiation elements (R41) which radiate to free space. The time-de- 
pendence of the heat load on the cylinder is shown in Fig. 3. In all calcula- 
tions, material properties of the metal and insulation are temperature depen- 
dent and are given in Table 2. The material properties were updated every 
fifty seconds. The number of nodes, the bandwidth of the Jacobian and the 
sparsity can be easily changed by varying the number of axial, radial and cir- 
cumferential elements. Thus the problem is suited for checking the performance 
of the ICCG algorithm vs. other algorithms for a wide range of these parameters. 

The cylinder problem was used to compare the performance of the following 
algorithms? 

(i) Band Gauss-Doolittle elimination 
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(ii) The conventional static over-relaxation (SOR) method with an over-re- 
laxation parameter of 1.4 

(iii) The fixed point iteration given by Eq. (9) with the matrix M given by 
the incomplete Cholesky, Eq. (11) 

(iv) The ICCG algorithm 

Three sizes of problems are considered, with the number of nodes being 
400,720 and 1100 respectively. For each size results are obtained by keeping 
the total number of nodes approximately fixed and varying the band width. The 
total CPU time for calculating 200 seconds of the response is shown in Figure 4 
for 400 nodes and in Figure 5 for 720 nodes (the SOR algorithm failed to con- 
verge for one point which is shown as a break in the curve) . 

For the largest problem the core storage requirements for the Gauss-Doo- 
little elimination could not be met- An explicit formula for predicting the 
computation time for the Gauss-Doolittle algorithm was devised on the basis of 
the available data (see Ref. 15) - This formula was used to estimate the run 
time for that algorithm for 1100 nodes. The other three algorithms were ac- 
tually run for 1100 nodes and the results are given in Figure 6. 

From Figures 4, 5, 6, it is clear that the iterative algorithms are quite 
superior to Gaussian elimination for this problem. The advantage increases with 
increasing number of nodes and increasing band width. The ICCG algorithm is 
superior to the other two iterative algorithms. This was due to an average 
number of iterations of about 2.75 compared to 5 or more for the two other al- 
gorithms. An examination of the detailed run times for each subroutine (Ref. 

15) revealed that most of the gains over the elimination algorithm were due to 
difference in the matrix decomposition. 

An additional advantage of the iterative algorithms over elimination al- 
gorithms results from the insensitivity to band-width. The analyst does not 
have to worry about the best numbering of the nodes to reduce band width. The 
sensitivity to nodal numbering is one of the major disadvantages of implicit 
techniques (when used with Gaussian elimination) compared to explicit integra- 
tion techniques. 


CONCLUDING REMARKS 

The system of ordinary differential equations generated by discretizing a 
transient heat transfer problem in structures is typically stiff. Such systems 
are most efficiently handled by implicit integration techniques which typically 
require the solution of large and sparse systems of linear algebraic equations. 
The generation and solution of these equations account for the major part of 
the computation in the solution of the differential equations. Traditionally, 
these algebraic equations are solved either by elimination techniques or by 
iterative techniques. Herein, it is suggested that a recently developed partial 
elimination algorithm can be more efficient for three dimensional problems 
where the algebraic equations are poorly banded. 
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The partial elimination algorithm - the incomplete Cholesky conjugate 
gradient (ICCG) algorithm is implemented in a finite element program for tran- 
sient heat transfer in structures. It is coupled with a sparse matrix assembly 
and storage systems. The techniques are demonstrated for a two dimensional 
space shuttle frame and an insulated cylinder modeled by three dimensional 
finite elements. The ICCG algorithm was compared to the Gauss-Doolittle elim- 
ination algorithm as well as to two iterative algorithms. For the cylinder 
problem the ICCG algorithm was shown to be greatly superior to the elimination 
algorithm and significantly better than the iterative algorithms. For the 
well banded two dimensional problem, the ICCG algorithm is still marginally 
better. The results indicate that the ICCG algorithm has a potential for large 
saving in computational resources when implemented in computer programs for 
transient heat transfer in structures. 
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Table 1. Solution Time for Shuttle Frame* 



DURATION 
TIME SEC. 

GAUSS-DOOLITTLE 
CPU SEC. 

ICCG 
CPU SEC 


200. 

52.54 

42.95 

500. 

128.92 

106.12 


*Results obtained on a PRIME 400 minicomputer 


Table 2. Material Properties for Insulated Cylinder 


(a) Insulation; p = 160 kg/m^ (0.00582 ibm/in^) 



T 




C 


K 





J/kg-'C 

Btu/lbm-°R 

W/m-°C 

Btu/in-s-°R 

200 


360 

523 

0.125 

0.0381 

5.1 X 10“^ 

367 


660 





.0546 

7,3 

478 


860 





.0711 

9,5 

589 


1060 





.0898 

1.2 X 10 “^ 

700 

- 

1260 

> 



f 

,112 

1.5 

811 


1460 





,142 

1.9 

922 


1660 





,180 

2.4 


(b) Aluminum p = 2770 kg/m^ (0.0101 Ibm/in^) 



T 


C 

K 




J/kg-°C 

Btu/lbm-°R 

W/m-°C 

Btu/in-s-°R 

200 

360 

769 

0.184 

99.5 

0.00133 

311 

560 

861 

.206 

125,0 

,00167 

367 

660 

903 

.216 

138.0 

.00185 

422 

760 

937 

.224 

154.8 

.00207 

478 

860 

974 

.233 

171.3 

.00229 

533 

960 

1012 

.242 

178.8 

.00239 

589 

1060 

1045 

.250 

181.1 

.00242 
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Figure 5.- 
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Figure 6.- Effect of Jacobian band width on total computation time 

for 1100 node cylinder models. 




